Collective Ontology-based Information Extraction using Probabilistic Graphical Models

Author

  • Slavko Zitnik
Abstract

Information Extraction (IE) is the process of extracting structured data from unstructured sources. It roughly consists of three subtasks: named entity recognition, relation extraction and coreference resolution. Researchers have primarily focused on just one subtask or on combining them in a pipeline. In this paper we introduce an intelligent collective IE system that combines all three subtasks using conditional random fields. Using the same learning model for every subtask lets the subtasks exchange information between iterations on the fly and correct errors during iterative execution. In addition to the architecture, we introduce novel semantic and collective feature functions. The system’s output is labelled according to an ontology, and new instances are created automatically at runtime. The ontology, as a schema, encodes a set of constraints, defines optional manual rules or patterns and, through its instances, provides semantic gazetteer lists. The proposed framework is being developed as part of ongoing PhD research. Its main contributions are the intelligent iterative interconnection of the selected subtasks, extensive use of context-specific features and a parameterless system that can be guided by an ontology. Preliminary experiments combining just two subtasks already show promising results compared with traditional approaches.


Similar resources

Proceedings Template - WORD

Most of the approaches for dealing with uncertainty in the Semantic Web rely on the principle that this uncertainty is already asserted. In this paper, we propose a new approach to learn and reason about uncertainty in the Semantic Web. Using instance data, we learn the uncertainty of an OWL ontology, and use that information to perform probabilistic reasoning on it. For this purpose, we use Ma...


Rule-based joint fuzzy and probabilistic networks

One of the important challenges in graphical models is dealing with uncertainty. Among graphical networks, the fuzzy cognitive map can model only fuzzy uncertainty, and the Bayesian network can model only probabilistic uncertainty. In many real problems, we face both fuzzy and probabilistic uncertainties. In these cases, the propo...


Relational Markov Networks for Collective Information Extraction

Most information extraction (IE) systems treat separate potential extractions as independent. However, in many cases, considering influences between different potential extractions could improve overall accuracy. Statistical methods based on undirected graphical models, such as conditional random fields (CRFs), have been shown to be an effective approach to learning accurate IE systems. We pres...


Collective Information Extraction with Relational Markov Networks

Most information extraction (IE) systems treat separate potential extractions as independent. However, in many cases, considering influences between different potential extractions could improve overall accuracy. Statistical methods based on undirected graphical models, such as conditional random fields (CRFs), have been shown to be an effective approach to learning accurate IE systems. We pres...


A Note on Methodology for Designing Ontology Management Systems

In this note we propose a novel methodology for designing the architecture of Ontology Management Systems, grounded in an ontology representation based on probabilistic graphical models. By discussing the troubles with ontology as a tool for managing knowledge, formal assumptions about the definition and representation of semantics arise, yielding an original architecture that will be presented and dis-




Publication date: 2012